shortest-path distance
Graph Distance as Surprise: Free Energy Minimization in Knowledge Graph Reasoning
In this work, we propose that reasoning in knowledge graph (KG) networks can be guided by surprise minimization. Entities that are close in graph distance will have lower surprise than those farther apart. This connects the Free Energy Principle (FEP) from neuroscience to KG systems, where the KG serves as the agent's generative model. We formalize surprise using the shortest-path distance in directed graphs and provide a framework for KG-based agents. Graph distance appears in graph neural networks as message passing depth and in model-based reinforcement learning as world model trajectories. This work-in-progress study explores whether distance-based surprise can extend recent work showing that syntax minimizes surprise and free energy via tree structures.
- North America > United States > New York > New York County > New York City (0.15)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.05)
- (4 more...)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
- Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.67)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (0.65)
Word2VecGD: Neural Graph Drawing with Cosine-Stress Optimization
We propose a novel graph visualization method leveraging random walk-based embeddings to replace costly graph-theoretical distance computations. Using word2vec-inspired embeddings, our approach captures both structural and semantic relationships efficiently. Instead of relying on exact shortest-path distances, we optimize layouts using cosine dissimilarities, significantly reducing computational overhead. Our framework integrates differentiable stress optimization with stochastic gradient descent (SGD), supporting multi-criteria layout objectives. Experimental results demonstrate that our method produces high-quality, semantically meaningful layouts while efficiently scaling to large graphs. Code available at: https://github.com/mlyann/graphv_nn
Local Intrinsic Dimensionality for Dynamic Graph Embeddings
Knežević, Dušica, Savić, Miloš, Radovanović, Miloš
The notion of local intrinsic dimensionality (LID) has important theoretical implications and practical applications in the fields of data mining and machine learning. Recent research efforts indicate that LID measures defined for graphs can improve graph representational learning methods based on random walks. In this paper, we discuss how NC-LID, a LID measure designed for static graphs, can be adapted for dynamic networks. Focusing on dynnode2vec as the most representative dynamic graph embedding method based on random walks, we examine correlations between NC-LID and the intrinsic quality of 10 real-world dynamic network embeddings. The obtained results show that NC-LID can be used as a good indicator of nodes whose embedding vectors do not tend to preserve temporal graph structure well. Thus, our empirical findings constitute the first step towards LID-aware dynamic graph embedding methods.
- Europe > Serbia > Vojvodina > South Bačka District > Novi Sad (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
Long Range Graph Benchmark
Dwivedi, Vijay Prakash, Rampášek, Ladislav, Galkin, Mikhail, Parviz, Ali, Wolf, Guy, Luu, Anh Tuan, Beaini, Dominique
Graph Neural Networks (GNNs) that are based on the message passing (MP) paradigm generally exchange information between 1-hop neighbors to build node representations at each layer. In principle, such networks are not able to capture long-range interactions (LRI) that may be desired or necessary for learning a given task on graphs. Recently, there has been an increasing interest in development of Transformer-based methods for graphs that can consider full node connectivity beyond the original sparse structure, thus enabling the modeling of LRI. However, MP-GNNs that simply rely on 1-hop message passing often fare better in several existing graph benchmarks when combined with positional feature representations, among other innovations, hence limiting the perceived utility and ranking of Transformer-like architectures. Here, we present the Long Range Graph Benchmark (LRGB) with 5 graph learning datasets: PascalVOC-SP, COCO-SP, PCQM-Contact, Peptides-func and Peptides-struct that arguably require LRI reasoning to achieve strong performance in a given task. We benchmark both baseline GNNs and Graph Transformer networks to verify that the models which capture long-range dependencies perform significantly better on these tasks. Therefore, these datasets are suitable for benchmarking and exploration of MP-GNNs and Graph Transformer architectures that are intended to capture LRI.
- North America > Canada > Quebec > Montreal (0.14)
- Asia > Singapore (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
FedGS: Federated Graph-based Sampling with Arbitrary Client Availability
Wang, Zheng, Fan, Xiaoliang, Qi, Jianzhong, Jin, Haibing, Yang, Peizhen, Shen, Siqi, Wang, Cheng
While federated learning has shown strong results in optimizing a machine learning model without direct access to the original data, its performance may be hindered by intermittent client availability which slows down the convergence and biases the final learned model. There are significant challenges to achieve both stable and bias-free training under arbitrary client availability. To address these challenges, we propose a framework named Federated Graph-based Sampling (FedGS), to stabilize the global model update and mitigate the long-term bias given arbitrary client availability simultaneously. First, we model the data correlations of clients with a Data-Distribution-Dependency Graph (3DG) that helps keep the sampled clients data apart from each other, which is theoretically shown to improve the approximation to the optimal model update. Second, constrained by the far-distance in data distribution of the sampled clients, we further minimize the variance of the numbers of times that the clients are sampled, to mitigate long-term bias. To validate the effectiveness of FedGS, we conduct experiments on three datasets under a comprehensive set of seven client availability modes. Our experimental results confirm FedGS's advantage in both enabling a fair client-sampling scheme and improving the model performance under arbitrary client availability. Our code is available at \url{https://github.com/WwZzz/FedGS}.
Multiple Manifold Clustering Using Curvature Constrained Path
The problem of multiple surface clustering is a challenging task, particularly when the surfaces intersect. Available methods such as Isomap fail to capture the true shape of the surface nearby the intersection and result in incorrect clustering. The Isomap algorithm uses the shortest path between points. The main draw back of the shortest path algorithm is due to the lack of curvature constrained where causes to have a path between points on different surfaces. In this paper, we tackle this problem by imposing a curvature constraint to the shortest path algorithm used in Isomap. The algorithm chooses several landmark nodes at random and then checks whether there is a curvature constrained path between each landmark node and every other node in the neighbourhood graph. We build a binary feature vector for each point where each entry represents the connectivity of that point to a particular landmark. Then the binary feature vectors could be used as an input of conventional clustering algorithm such as hierarchical clustering. We apply our method to simulated and some real datasets and show, it performs comparably to the best methods such as K-manifold and spectral multi-manifold clustering.
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > Florida > Palm Beach County > Boca Raton (0.04)
- North America > United States > District of Columbia > Washington (0.04)
- (2 more...)
Manifold Matching using Shortest-Path Distance and Joint Neighborhood Selection
Shen, Cencheng, Vogelstein, Joshua T., Priebe, Carey E.
Matching datasets of multiple modalities has become an important task in data analysis. Existing methods often rely on the embedding and transformation of each single modality without utilizing any correspondence information, which often results in sub-optimal matching performance. In this paper, we propose a nonlinear manifold matching algorithm using shortest-path distance and joint neighborhood selection. Specifically, a joint nearest-neighbor graph is built for all modalities. Then the shortest-path distance within each modality is calculated from the joint neighborhood graph, followed by embedding into and matching in a common low-dimensional Euclidean space. Compared to existing algorithms, our approach exhibits superior performance for matching disparate datasets of multiple modalities.
- North America > United States (0.28)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Asia > Middle East > Jordan (0.04)
- Health & Medicine (0.94)
- Government > Military (0.68)
Euclidean Distances, soft and spectral Clustering on Weighted Graphs
We define a class of Euclidean distances on weighted graphs, enabling to perform thermodynamic soft graph clustering. The class can be constructed form the "raw coordinates" encountered in spectral clustering, and can be extended by means of higher-dimensional embeddings (Schoenberg transformations). Geographical flow data, properly conditioned, illustrate the procedure as well as visualization aspects.
- Europe > Switzerland > Vaud > Lausanne (0.04)
- Asia > Middle East > Jordan (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)